This lecture is to introduce the elements in R programming language that are relevant to decision model and decision analysis.
vector, matrix, array, list, and data.frame.x1 <- c(3, 1, 4)
print(x1)
## [1] 3 1 4
x2 <- c(2, 7, 1)
print(x1 + x2)
## [1] 5 8 5
future_human <- c("earther", "martian", "belter")
print(future_human)
## [1] "earther" "martian" "belter"
You could combine different type of vectors
x1_and_future_human <- c(x1, future_human)
print(x1_and_future_human)
## [1] "3" "1" "4" "earther" "martian" "belter"
We often create the initial status in the Markov model using vector
v_init <- c(0.9, 0.09, 0.01)
names(v_init) <- c("healthy", "sick", "dead")
print(v_init)
## healthy sick dead
## 0.90 0.09 0.01
print(sum(v_init))
## [1] 1
print(v_init[2])
## sick
## 0.09
print(v_init["sick"])
## sick
## 0.09
print(v_init[2:3])
## sick dead
## 0.09 0.01
print(v_init[c("sick", "dead")])
## sick dead
## 0.09 0.01
x <- matrix(c(1, 0, 0,
0.8, 0.2, 0,
0, 0, 1),
byrow = T, nrow = 3)
print(x)
## [,1] [,2] [,3]
## [1,] 1.0 0.0 0
## [2,] 0.8 0.2 0
## [3,] 0.0 0.0 1
rownames(x) <- colnames(x) <- c("healthy", "sick", "dead")
print(x)
## healthy sick dead
## healthy 1.0 0.0 0
## sick 0.8 0.2 0
## dead 0.0 0.0 1
print(x[2, 2])
## [1] 0.2
print(x["sick", "sick"])
## [1] 0.2
Matrix multiplication is often used in Markov models. For example, we can multiply the initial state with the transition matrix
v_init is not a matrix. Thus, we use t() to convert v_init into a \(1 \times 3\) matrix.%*% is the symbol for matrix multiplication in Rprint(t(v_init) %*% x)
## healthy sick dead
## [1,] 0.972 0.018 0.01
array() is more useful.array() can be seen as a data type that stores multiple matrices all at once.tr1 <- x
tr2 <- matrix(c(0.9, 0.1, 0,
0.7, 0.2, 0.1,
0, 0, 1),
byrow = T, nrow = 3,
dimnames = list(c("healthy", "sick", "dead"),
c("healthy", "sick", "dead")))
tr_array <- array(dim = c(3, 3, 2),
data = cbind(tr1, tr2),
dimnames = list(c("healthy", "sick", "dead"),
c("healthy", "sick", "dead"),
c(1, 2)))
print(tr_array)
## , , 1
##
## healthy sick dead
## healthy 1.0 0.0 0
## sick 0.8 0.2 0
## dead 0.0 0.0 1
##
## , , 2
##
## healthy sick dead
## healthy 0.9 0.1 0.0
## sick 0.7 0.2 0.1
## dead 0.0 0.0 1.0
print(tr_array[ , , 1])
## healthy sick dead
## healthy 1.0 0.0 0
## sick 0.8 0.2 0
## dead 0.0 0.0 1
print(tr_array[ , , 2])
## healthy sick dead
## healthy 0.9 0.1 0.0
## sick 0.7 0.2 0.1
## dead 0.0 0.0 1.0
print(tr_array[1, 2, 1])
## [1] 0
v_time2 <- t(v_init) %*% tr_array[ , , 1]
print(v_time2)
## healthy sick dead
## [1,] 0.972 0.018 0.01
v_time3 <- v_time2 %*% tr_array[ , , 2]
print(v_time3)
## healthy sick dead
## [1,] 0.8874 0.1008 0.0118
temp_ls <- list(future_human = future_human, v_init = v_init, tr_array = tr_array)
print(temp_ls)
## $future_human
## [1] "earther" "martian" "belter"
##
## $v_init
## healthy sick dead
## 0.90 0.09 0.01
##
## $tr_array
## , , 1
##
## healthy sick dead
## healthy 1.0 0.0 0
## sick 0.8 0.2 0
## dead 0.0 0.0 1
##
## , , 2
##
## healthy sick dead
## healthy 0.9 0.1 0.0
## sick 0.7 0.2 0.1
## dead 0.0 0.0 1.0
print(temp_ls[[1]])
## [1] "earther" "martian" "belter"
print(temp_ls$v_init)
## healthy sick dead
## 0.90 0.09 0.01
print(temp_ls$tr_array[, , 1])
## healthy sick dead
## healthy 1.0 0.0 0
## sick 0.8 0.2 0
## dead 0.0 0.0 1
data.frameR, you probably have encountered data.frame() very frequently.data.frame()profile <- data.frame(name = c("Amos", "Bobbie", "Naomi"),
human_type = c("earther", "martian", "belter"),
height = c(1.8, 2.1, 1.78))
print(profile)
## name human_type height
## 1 Amos earther 1.80
## 2 Bobbie martian 2.10
## 3 Naomi belter 1.78
data.frame() is essentially a named list of vectors with the same length.typeof(profile)
## [1] "list"
length(profile$name)
## [1] 3
length(profile$human_type)
## [1] 3
length(profile$height)
## [1] 3
data.frame()profile[profile$name == "Bobbie", ]
## name human_type height
## 2 Bobbie martian 2.1
profile[profile$name == "Bobbie", "height"]
## [1] 2.1
R data format family:
.RData, .rda, and .rds.load() and readRDS()You can save the data using save() and saveRDS().
It is common that you might encounter other data types (e.g., .csv and .txt) or even data format for other software (e.g., Stata and SAS).
To read or save other types of data, you could find the information here and here.
Components included in loops
The most used loop is for loop in many decision models. The structure follows:
for (i in c(start : end)) {
# Routine / Process
}
Example 1: Fibonacci sequence is the sum of the two preceding numbers. We start the sequence from 0 and 1. What are the first 10 Fibonacci numbers? (0 and 1 are the 1st and 2nd Fibonacci number, respectively.)
Note that we want to get all the 10 Fibonacci numbers. We need to create a vector to store all the 10 numbers.
fib_vec.fib_vec <- rep(0, 10)
fib_vec[c(1:2)] <- c(0, 1)
print(fib_vec)
## [1] 0 1 0 0 0 0 0 0 0 0
for (i in c(3 : 10)) {
fib_vec[i] <- fib_vec[i - 1] + fib_vec[i - 2]
}
print(fib_vec)
## [1] 0 1 1 2 3 5 8 13 21 34
Example 2: In year 2400, there are 3000 martians in Mars colony. The growth rate of the martian population follows this formula \(0.05 P(t) \bigl(1 - \frac{P(t)}{10000})\). What is the population size in year 2500?
# initialize a vector of population size over the next 100 years
popsize <- rep(3000, 100)
# Calculate
for (t in c(1 : 100)) {
popsize[t + 1] <- popsize[t] + 0.05 * popsize[t] * (1 - popsize[t] / 100000)
}
popdf <- data.frame(year = c(2400:2500),
popsize = popsize)
print(popdf[popdf$year == 2500, ])
## year popsize
## 101 2500 81499.86
if (condition1) {
# Execute some code
} else if (condition2) {
# Execute some code
} else {
# Execute some code
}
else if and else are not always needed.if()"yoda" == "windu"
## [1] FALSE
Jedi <- c("yoda", "windu", "kenobi")
"yoda" %in% Jedi
## [1] TRUE
if() is true.Mandalorian <- c("satine", "sabine", "jango")
x <- "yoda"
if (x %in% Jedi) {
print("May the force be with you!")
} else if (x %in% Mandalorian) {
print("This is the way!")
} else {
print("Hello, world!")
}
## [1] "May the force be with you!"
"Yoda", you get "Hello, world!"R, e.g., lm(), sum(), print(), etc.The primary components of an R function are the body(), formals(), and environment(). We often want to return the results from the function.
body(): the code inside the function.formals(): the input arguments.environment(): where the variables in the function are located.return(): return the relevant outputspeak <- function(x) {
Jedi <- c("yoda", "windu", "kenobi")
Mandalorian <- c("satine", "sabine", "jango")
membership <- "Not Jedi or Mandalorian"
if (x %in% Jedi) {
say <- "May the force be with you!"
membership <- "Jedi"
} else if (x %in% Mandalorian) {
say <- "This is the way!"
membership <- "Mandalorian"
} else {
say <- "Hello, world!"
}
return(list(say = say, membership = membership))
}
formals(speak)
## $x
body(speak)
## {
## Jedi <- c("yoda", "windu", "kenobi")
## Mandalorian <- c("satine", "sabine", "jango")
## membership <- "Not Jedi or Mandalorian"
## if (x %in% Jedi) {
## say <- "May the force be with you!"
## membership <- "Jedi"
## }
## else if (x %in% Mandalorian) {
## say <- "This is the way!"
## membership <- "Mandalorian"
## }
## else {
## say <- "Hello, world!"
## }
## return(list(say = say, membership = membership))
## }
environment(speak)
## <environment: R_GlobalEnv>
speak("jango")
## $say
## [1] "This is the way!"
##
## $membership
## [1] "Mandalorian"
Question: How would you program the Matian population growth in a function? What are the input arguments? What are the results return from the function?
x <- rnorm(1000, 0, 1) # draw 1000 samples from a normal distribution
print(mean(x))
## [1] 0.02952562
print(sd(x))
## [1] 1.001387
hist(x, freq = F, col = "gray")
library(CEAutil)
data(worldHE)
print(head(worldHE))
## Entity Code Year LEyr HealthExp Pop
## 1 Afghanistan AFG 1800 NA NA 3280000
## 2 Afghanistan AFG 1801 NA NA 3280000
## 3 Afghanistan AFG 1802 NA NA 3280000
## 4 Afghanistan AFG 1803 NA NA 3280000
## 5 Afghanistan AFG 1804 NA NA 3280000
## 6 Afghanistan AFG 1805 NA NA 3280000
worldHE2010 <- worldHE[worldHE$Year == 2010, ]
hist(worldHE2010$LEyr, main = "Life expectancy in 2010")
hist(worldHE2010$HealthExp, main = "Health expenditure in 2010 per capita")
plot(worldHE2010$HealthExp, worldHE2010$LEyr, ylim = c(70, 90),
xlab = "health expenditure", ylab = "life expectancy")
text(worldHE2010$HealthExp, worldHE2010$LEyr, labels = worldHE2010$Entity, cex = 0.8)
worldHE_US <- worldHE[worldHE$Entity == "United States" & worldHE$Year >= 1970 & worldHE$Year <= 2015, ]
# adding lines: UK's expenditure at the same time period
worldHE_UK <- worldHE[worldHE$Entity == "United Kingdom" & worldHE$Year >= 1970 & worldHE$Year <= 2015, ]
plot(worldHE_US$Year, worldHE_US$HealthExp, type = "l", col = "red",
ylab = "expenditure", xlab = "year", ylim = c(500, 10000))
lines(worldHE_UK$Year, worldHE_UK$HealthExp, col = "royalblue")
legend("topleft", legend=c("US", "UK"),
col=c("red", "royalblue"), lty = 1, cex = 0.8)
ggplot2ggplot2, dampack, and CEAutil.if(!require(ggplot2)) install.packages("ggplot2")
if(!require(devtools)) install.packages("devtools")
if(!require(remotes)) install.packages("remotes")
if(!require(dampack)) remotes::install_github("DARTH-git/dampack", dependencies = TRUE)
if(!require(CEAutil)) remotes::install_github("syzoekao/CEAutil", dependencies = TRUE)
For a simple decision tree, we can draw the tree easily. As the decision tree grows, there is software helping you draw the tree such as TreeAge and Amua. In this class, we use Amua because it is free. We will show how to use Amua to build a decision tree and export the tree to R script for CEA analysis.
Follow the instruction here to install Amua. Be aware of the difference between Mac and Windows users! After you install Amua, remember where you store the software.
Amua.jar.temp_Export/..R files: main.R and functions.R.main.R.Dracula is preparing for a spring break party tonight. But there’s one final decision that Dracula is struggling with. Being a vampire, he intends to bite and suck the blood of some of his guests at the party (about 25% of them, by his best estimate). While being bitten by a vampire won’t turn his guests into vampires or zombies, like some horror movies might suggest, there is a 50% chance that a vampire bite results in a rather severe bacterial infection, of which 66% of cases require hospitalization for an average of 1 night.
Being a gracious host, Dracula is considering different ways of administering antibiotic prophylaxis to his guests to reduce their risk of infection. One option is to administer the antibiotics to his victim‐guests just before he bites them – this would reduce the risk of a vampire bite infection by 20%. However, the antibiotics can be even more effective, reducing the risk of infection by 90%, if administered at least 30 minutes before being bitten. To achieve this, Dracula is considering putting the antibiotics into the drinks served at the party to ensure that his guests are all properly dosed before he bites his victims. But this means that all his guests would be exposed to the antibiotics (not just those he intends to bite), and he knows that about 5% of people are severely allergic to these antibiotics and would require immediate hospitalization if exposed. Dracula is therefore also considering not administering antibiotic prophylaxis at all to avoid this harm.
However, all the healthy blood doesn’t come without cost. This party will cost Dracula $1000 for a total of 200 guests (an average of $5 per guest). In addition, Dracula expects the cost of antibiotic to be $10 for each guest he bites. If Dracula decides to administer the antibiotics in all the drinks served at the party, the total cost of antibioitics is expected to be $100 (Dracula gets discount for buying a large batch of antibiotics!). Also, Dracula is willing to pay for the cost of hospitalization for any guest who experience bacteiral infection due to his bites because he feels responsible. The cost of hospitalization per person per night is $500. Dracula expects to get an average of 470 mL healthy blood by biting a guest. However, if the guest ends up hospitalized due to bacterial infection, the healthy blood that Dracula can get is reduced by 10%.
Dracula hopes you can help him determine which strategy he should choose.
You can build a decision tree via Amua.
if(!require("remotes")) install.packages("remotes")
if(!require("CEAutil")) remotes::install_github("syzoekao/CEAutil", dependencies = TRUE)
library(CEAutil)
parse_amua_tree():This function only takes an input argument, the path to the main.R file of the Amua decision model. The function returns a list of outputs. Output 1 param_ls is a list of input parameters with basecase values used in Amua. Output 2 treefunc is the R code of the Amua decision model formatted as text.
treetxt <- parse_amua_tree("AmuaExample/DraculaParty_Export/main.R")
dectree_wrapper():We use the two outputs from the parse_amua_tree() as the input arguments in the dectree_wrapper(). The input arguments in the wrapper function include params_basecase, treefunc, and popsize. params_basecase takes a list of named input parameters. treefunc takes the text file organized by the parse_amua_tree(). popsize is defaulted as 1 but you could change your population size.
param_ls <- treetxt[["param_ls"]]
treefunc <- treetxt[["treefunc"]]
Example 1. Dracula has been starved over the long cruel winter in Minnesota. The Spring break is the first time that his guests are willing to come to his party in several months. Therefore, Dracula is going to seize the chance to bite as many guests as possible. The probability that he bites a guest is now increased to 50%. What are the cost and effectiveness of each strategy?
Example 2. The cost of hospitalization due to bacterial infection varies from guest to guest depending on the healthcare that a guest has. Overall, the cost of hospitalization has a mean of $500 with a standard deviation $300 following a gamma distribution. What are the mean cost and effectiveness of the party across all 200 guests?